A Conspiracy of Random X and Model Violation against Classical Inference in Linear Regression
Author
Abstract
Following the econometric literature on model misspecification, we examine statistical inference for linear regression coefficients β_j when the predictors are random and the first- and/or second-order assumptions of the linear model are violated: E[Y | X_1, ..., X_p] is not linear in the predictors and/or V[Y | X_1, ..., X_p] is not constant. Such inference is meaningful if the linear model is viewed as a useful approximation rather than as part of a generative truth. A difficulty well known in econometrics is that the standard errors required under random predictors and model violations can differ greatly from the conventional standard errors that are valid when the linear model is correct. The difference stems from a synergistic effect between model violations and the randomness of the predictors. We show that asymptotically the ratios between correct and conventional standard errors can range from zero to infinity, and that these ratios vary between predictors within the same multiple regression. This difficulty has consequences for statistics: it entails the breakdown of the classical ancillarity argument for predictors. When the assumptions of a generative regression model are violated, predictors are no longer ancillary, treating them as fixed is no longer valid, and standard inferences may lose their significance and confidence guarantees. The standard econometric solution for consistent inference under misspecification and random predictors is based on the “sandwich estimator” of the covariance matrix of β̂. A plausible alternative is the paired bootstrap, which resamples predictors and response jointly. Discrepancies between conventional and bootstrap standard errors can be used as diagnostics for predictor-specific model violations, in analogy to econometric misspecification tests. The good news is that when model violations are strong enough to invalidate conventional linear inference, their nature tends to be visible in graphical diagnostics.
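To make the comparison concrete, here is a minimal, self-contained Python sketch (not code from the paper; the quadratic data-generating process, sample size n = 500, and bootstrap size B = 2000 are illustrative assumptions). It fits a straight line to data whose true mean is nonlinear in a random predictor, then contrasts conventional OLS standard errors with the sandwich estimator and the paired bootstrap described above.

```python
import numpy as np

# Illustrative simulation: random predictor, nonlinear truth, deliberately misspecified linear fit.
rng = np.random.default_rng(0)
n, B = 500, 2000                                 # sample size, bootstrap replicates
x = rng.normal(size=n)                           # random predictor
y = x + 0.5 * x**2 + rng.normal(size=n)          # true mean is quadratic; we still fit a line

X = np.column_stack([np.ones(n), x])             # design matrix: intercept and slope
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ beta_hat

# Conventional standard errors: sigma^2 (X'X)^{-1}, valid only if the linear model is correct.
XtX_inv = np.linalg.inv(X.T @ X)
sigma2 = resid @ resid / (n - X.shape[1])
se_conventional = np.sqrt(np.diag(sigma2 * XtX_inv))

# Sandwich (model-robust) standard errors: (X'X)^{-1} [sum_i e_i^2 x_i x_i'] (X'X)^{-1}.
meat = X.T @ (X * resid[:, None] ** 2)
se_sandwich = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))

# Paired bootstrap: resample (x, y) pairs jointly, refit, take the SD of the coefficients.
boot = np.empty((B, X.shape[1]))
for b in range(B):
    idx = rng.integers(0, n, size=n)
    boot[b] = np.linalg.lstsq(X[idx], y[idx], rcond=None)[0]
se_bootstrap = boot.std(axis=0, ddof=1)

print("conventional SE:   ", se_conventional)
print("sandwich SE:       ", se_sandwich)
print("paired bootstrap SE:", se_bootstrap)
```

In a setup like this the sandwich and paired-bootstrap standard errors agree with each other but can differ markedly from the conventional ones; a large discrepancy for a particular coefficient is the predictor-specific diagnostic signal discussed in the abstract.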
Similar References
Models as Approximations — A Conspiracy of Random Predictors and Model Violations Against Classical Inference in Regression
Abstract. We review and interpret the early insights of Halbert White who over thirty years ago inaugurated a form of statistical inference for regression models that is asymptotically correct even under “model misspecification,” that is, under the assumption that models are approximations rather than generative truths. This form of inference, which is pervasive in econometrics, relies on the “...
The Conspiracy of Random Predictors and Model Violations against Classical Inference in Regression
We review the early insights of Halbert White who over thirty years ago inaugurated a form of statistical inference for regression models that is asymptotically correct even under “model misspecification.” This form of inference, which is pervasive in econometrics, relies on the “sandwich estimator” or “heteroskedasticity-consistent estimator” of standard error. Whereas common practice in stati...
Bayesian Inference for Spatial Beta Generalized Linear Mixed Models
In some applications, the response variable assumes values in the unit interval. The standard linear regression model is not appropriate for modelling this type of data because the normality assumption is not met. Alternatively, the beta regression model has been introduced to analyze such observations. A beta distribution represents a flexible density family on the (0, 1) interval that covers symm...
Models as Approximations — A Conspiracy of Random Regressors and Model Misspecification Against Classical Inference in Regression
Abstract. More than thirty years ago Halbert White inaugurated a “model-robust” form of statistical inference based on the “sandwich estimator” of standard error. This estimator is known to be “heteroskedasticity-consistent”, but it is less well-known to be “nonlinearity-consistent” as well. Nonlinearity raises fundamental issues because regressors are no longer ancillary, hence can’t be treated ...
Bayesian Unimodal Density Regression for Causal Inference
Background: Karabatsos and Walker (2011) introduced a new Bayesian nonparametric (BNP) regression model. Through analyses of real and simulated data, they showed that the BNP regression model outperforms other parametric and nonparametric regression models in common use, in terms of predictive accuracy of the outcome (dependent) variable. The other, outperformed, regres...